Relative Rank Statistics for Dialog Analysis
Abstract
We introduce the relative rank differential statistic, a non-parametric approach to document and dialog analysis based on word frequency rank statistics. We also present a simple method for establishing semantic saliency in dialogs, documents, and dialog segments using these rank statistics. Applications of our technique include dynamic tracking of topic and semantic evolution in a dialog, topic detection, automatic generation of document tags, and new story or event detection in conversational speech and text. Our approach benefits from the robustness, simplicity, and efficiency of non-parametric, rank-based methods, and it consistently outperformed term-frequency and TF-IDF cosine distance approaches in several experiments.
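The abstract does not give the formula for the statistic, so the following is only a minimal sketch of one way a rank-based saliency score could be computed. The function names `frequency_ranks` and `relative_rank_differential`, the dense ranking of ties, the rank assigned to unseen words, and the normalization by the background rank are all assumptions made for illustration, not the authors' definition.

```python
from collections import Counter


def frequency_ranks(tokens):
    """Map each word to its frequency rank (1 = most frequent).

    Assumption: words with equal counts share a rank (dense ranking).
    """
    counts = Counter(tokens)
    distinct_counts = sorted(set(counts.values()), reverse=True)
    rank_of_count = {c: i + 1 for i, c in enumerate(distinct_counts)}
    return {word: rank_of_count[c] for word, c in counts.items()}


def relative_rank_differential(segment_tokens, background_tokens):
    """Hypothetical saliency score per word in the segment.

    score = (background_rank - segment_rank) / background_rank
    A positive score means the word ranks higher in the segment than in
    the background text. This formula is an illustrative assumption.
    """
    seg_ranks = frequency_ranks(segment_tokens)
    bg_ranks = frequency_ranks(background_tokens)
    # Assumption: words unseen in the background get one rank below the worst.
    unseen_rank = max(bg_ranks.values(), default=0) + 1
    scores = {}
    for word, seg_rank in seg_ranks.items():
        bg_rank = bg_ranks.get(word, unseen_rank)
        scores[word] = (bg_rank - seg_rank) / bg_rank
    return scores


# Toy dialog: score the last five tokens against the whole dialog.
dialog = "the cat sat on the mat and the dog chased the dog".split()
segment = dialog[7:]
scores = relative_rank_differential(segment, dialog)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.3f}")
```

In this toy example the function word "the" is the most frequent word in both the segment and the whole dialog, so its score is zero, while "dog" and "chased" rise in rank within the segment and receive positive scores. Because the score depends only on ranks, it is unaffected by the absolute size of the counts, which is the kind of robustness the abstract attributes to rank-based approaches.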
Related resources
Bootstrap and fast double bootstrap tests of cointegration rank with financial time series
The likelihood ratio test of cointegration rank is the most widely used test for cointegration. Many studies have shown by simulation that the small sample distribution is not well approximated by the limiting distribution. We suggest using the bootstrap to generate small sample critical values instead of correcting the test statistics. The idea of bootstrapping the trace test of cointegration ...
Hypotheses ranking for robust domain classification and tracking in dialogue systems
We present a novel application of hypothesis ranking (HR) for the task of domain detection in a multi-domain, multiturn dialog system. Alternate, domain dependent, semantic frames from a spoken language understanding (SLU) analysis are ranked using a gradient boosted decision trees (GBDT) ranker to determine the most likely domain. The ranker, trained using Lambda Rank, makes use of a range of ...
Semantic tokenization of verbalized numbers in language modeling
In spoken dialog systems, number strings frequently carry crucial information such as DATE, TIME, and PRICE. Yet numbers are inherently difficult to recognize, partly because reliable statistics for training a language model are hard to obtain. In this paper, we take advantage of the fact that dialog systems perform some form of semantic parsing. We use this parsing information to distinguis...
Rank tests and regression rank score tests in measurement error models
The rank and regression rank score tests of linear hypotheses in the linear regression model are modified for measurement error models. The modified tests are still distribution free. Some tests of linear subhypotheses are invariant to the nuisance parameter, others are based on the aligned ranks using the R-estimators. The asymptotic relative efficiencies of tests with respect to tests in model...
Empirical Methods for Evaluating Dialog Systems
We examine what purpose a dialog metric serves and then propose empirical methods for evaluating systems that meet that purpose. The methods include a protocol for conducting a wizard-of-oz experiment and a basic set of descriptive statistics for substantiating performance claims using the data collected from the experiment as an ideal benchmark or “gold standard” for comparative judgments. The...
Publication date: 2008